
    Polyphonic Sound Event Detection by using Capsule Neural Networks

    Artificial sound event detection (SED) aims to mimic the human ability to perceive and understand what is happening in the surroundings. Deep Learning currently offers valuable techniques for this goal, such as Convolutional Neural Networks (CNNs). The Capsule Neural Network (CapsNet) architecture was recently introduced in the image processing field to overcome some known limitations of CNNs, specifically their limited robustness to affine transformations (i.e., perspective, size, orientation) and to the detection of overlapping images. This motivated the authors to employ CapsNets for the polyphonic SED task, in which multiple sound events occur simultaneously. Specifically, we propose to exploit the capsule units to represent a set of distinctive properties of each individual sound event. Capsule units are connected through a so-called "dynamic routing" procedure that encourages learning part-whole relationships and improves detection performance in a polyphonic context. This paper reports extensive evaluations carried out on three publicly available datasets, showing that the CapsNet-based algorithm not only outperforms standard CNNs but also achieves the best results with respect to state-of-the-art algorithms.
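
    As an illustration of the dynamic routing mentioned above, the following is a minimal sketch of routing-by-agreement between capsule layers, assuming PyTorch and arbitrary capsule dimensions; it is not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    # Non-linear "squash": keeps the vector orientation, maps its length into [0, 1).
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """u_hat: (batch, n_in, n_out, d_out) prediction vectors from lower capsules."""
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)       # routing logits
    for _ in range(num_iters):
        c = F.softmax(b, dim=2)                                  # coupling coefficients over output capsules
        s = (c.unsqueeze(-1) * u_hat).sum(dim=1)                 # weighted sum over input capsules
        v = squash(s)                                            # (batch, n_out, d_out)
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)             # agreement update
    return v

# Example: 32 input capsules routed to 10 output (event) capsules of dimension 16.
u_hat = torch.randn(4, 32, 10, 16)
v = dynamic_routing(u_hat)
print(v.shape)  # torch.Size([4, 10, 16]); each vector length can act as an event activity
```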

    Multi-household energy management in a smart neighborhood in the presence of uncertainties and electric vehicles

    The pathway toward the reduction of greenhouse gas emissions depends on increasing Renewable Energy Sources (RESs), demand response, and the electrification of public and private transportation. Energy management techniques are necessary to coordinate operation in this complex scenario, and in recent years several works on this topic have appeared in the literature. This paper presents a study on multi-household energy management for Smart Neighborhoods integrating RESs and electric vehicles participating in Vehicle-to-Home (V2H) and Vehicle-to-Neighborhood (V2N) programs. The Smart Neighborhood comprises multiple households, a parking lot with public charging stations, and an aggregator that coordinates energy transactions using a Multi-Household Energy Manager (MH-EM). The MH-EM jointly maximizes the profits of the aggregator and the households by using the augmented ɛ-constraint approach. The generated Pareto optimal solutions allow for different decision policies that balance the aggregator's and households' profits, prioritizing one of them or the RES energy usage within the Smart Neighborhood. The experiments have been conducted over an entire year, considering uncertainties related to the energy price, electric vehicle usage, energy production of RESs, and energy demand of the households. The results show that the MH-EM optimizes the Smart Neighborhood operation and that the solution maximizing RES energy usage also provides the greatest benefits in terms of peak-shaving and valley-filling of the energy demand.
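
    A minimal sketch of the augmented ɛ-constraint idea on a hypothetical two-objective toy LP; the objective coefficients, bounds, and slack weight below are illustrative assumptions, and the paper's actual MH-EM model (households, EVs, RESs, uncertainties) is far richer.

```python
import numpy as np
from scipy.optimize import linprog

delta = 1e-3                                  # small weight on the slack (the "augmentation" term)
pareto = []
for eps in np.linspace(10, 20, 6):            # sweep the second objective (e.g. household profit)
    # Variables z = [x1, x2, s]; maximize f1(x) + delta*s  ->  minimize its negation.
    c = [-3.0, -1.0, -delta]
    res = linprog(c,
                  A_ub=[[1.0, 1.0, 0.0]], b_ub=[10.0],      # shared resource constraint
                  A_eq=[[1.0, 2.0, -1.0]], b_eq=[eps],      # f2(x) - s = eps (slack turns >= into =)
                  bounds=[(0, None)] * 3)
    x1, x2, _ = res.x
    pareto.append((3 * x1 + x2, x1 + 2 * x2))               # (aggregator-like, household-like) profits
print(pareto)                                 # one Pareto optimal trade-off point per eps value
```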

    Automatic detection of cry sounds in neonatal intensive care units by using deep learning and acoustic scene simulation

    Cry detection is an important facility in both residential and public environments, and it can address the different needs of both private and professional users. In this paper, we investigate the problem of cry detection in professional environments, such as Neonatal Intensive Care Units (NICUs). The aim of our work is to propose a cry detection method based on deep neural networks (DNNs) and to evaluate whether a properly designed synthetic dataset can replace on-field acquired data for training the DNN-based cry detector. In this way, a massive data collection campaign in NICUs can be avoided, and the cry detector can be easily retargeted to different NICUs. The paper presents different solutions based on single-channel and multi-channel DNNs. The experimental evaluation is conducted on a synthetic dataset created by simulating the acoustic scene of a real NICU, and on a real dataset containing audio acquired in the same NICU. The evaluation revealed that using real data in the training phase achieves the overall highest performance, with an Area Under the Precision-Recall Curve (PRC-AUC) equal to 87.28%, when signals are processed with a beamformer and a post-filter and a single-channel DNN is used. The performance of the same method, however, drops to 70.61% when training is performed on the synthetic dataset. On the contrary, under the same conditions, the new single-channel architecture introduced in this paper achieves the highest performance, with a PRC-AUC equal to 80.48%, thus showing that the acoustic scene simulation strategy can be used to train a cry detection method with positive results.
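
    The PRC-AUC figure of merit used above can be computed, for example, with scikit-learn; the labels and detector scores below are made-up placeholders.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, auc

# Hypothetical frame-level labels (1 = cry present) and detector scores.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.05, 0.6])

precision, recall, _ = precision_recall_curve(y_true, y_score)
prc_auc = auc(recall, precision)   # area under the precision-recall curve
print(f"PRC-AUC: {prc_auc:.2%}")
```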

    Improving knowledge distillation for non-intrusive load monitoring through explainability guided learning

    Knowledge distillation (KD) is a machine learning technique widely used in recent years for domain adaptation and complexity reduction. It relies on a Student-Teacher mechanism to transfer the knowledge of a large and complex Teacher network into a smaller Student model. Given the inherent complexity of large Deep Neural Network (DNN) models and the need for deployment on edge devices with limited resources, complexity reduction techniques have become a hot topic in the Non-Intrusive Load Monitoring (NILM) community. Recent literature in NILM has devoted increased effort to domain adaptation and architecture reduction via KD. However, the mechanism behind the transfer of knowledge from the Teacher to the Student is not clearly understood. In this work, we address this issue by placing the KD NILM approach in a framework of explainable AI (XAI). We identify the main inconsistency in the transfer of explainable knowledge, and exploit this information to propose a method for improving KD through explainability-guided learning. We evaluate our approach on a variety of appliances and domain adaptation scenarios and demonstrate that resolving inconsistencies in the transfer of explainable knowledge can lead to an improvement in predictive performance.
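
    A minimal sketch of what explainability-guided distillation could look like, assuming gradient-based saliency maps as the explanation and a multi-label NILM setting; the actual XAI method, loss weighting, and network architectures used in the paper are not specified here.

```python
import torch
import torch.nn.functional as F

def xai_guided_kd_loss(student, teacher, x, y, alpha=0.5, beta=0.1):
    """x: aggregate-power input window, y: multi-label appliance targets in {0, 1}."""
    x = x.detach().clone().requires_grad_(True)
    s_logits = student(x)
    with torch.no_grad():
        t_logits = teacher(x)

    task = F.binary_cross_entropy_with_logits(s_logits, y)                    # hard-label term
    distill = F.mse_loss(torch.sigmoid(s_logits), torch.sigmoid(t_logits))    # soft-target term

    # Explanation term: the input-gradient (saliency) maps of Student and Teacher should agree.
    s_sal = torch.autograd.grad(s_logits.sum(), x, create_graph=True)[0]
    x_t = x.detach().clone().requires_grad_(True)
    t_sal = torch.autograd.grad(teacher(x_t).sum(), x_t)[0]
    return task + alpha * distill + beta * F.mse_loss(s_sal, t_sal.detach())
```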

    Knowledge distillation for scalable non-intrusive load monitoring

    Smart meters allow the grid to interface with individual buildings and to extract detailed consumption information by applying Non-Intrusive Load Monitoring (NILM) algorithms to the acquired data. Deep Neural Networks, which represent the state of the art for NILM, are affected by scalability issues, since they require high computational and memory resources, and by reduced performance when the training and target domains are mismatched. This paper proposes a knowledge distillation approach for NILM, in particular for multi-label appliance classification, to reduce model complexity and improve generalisation on unseen data domains. The approach uses weak supervision to reduce the labelling effort, which is useful in practical scenarios. Experiments conducted on the UK-DALE and REFIT datasets demonstrate that a low-complexity network can be obtained for deployment on edge devices while maintaining high performance on unseen data domains. The proposed approach outperformed benchmark methods in unseen target domains, achieving an F1-score 0.14 higher than a benchmark model 78 times more complex.
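
    As a sketch of the underlying Student-Teacher transfer for multi-label appliance classification, a temperature-scaled distillation loss could look like the following (PyTorch assumed; the exact loss and weak-supervision scheme of the paper may differ).

```python
import torch
import torch.nn.functional as F

def multilabel_kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: temperature-softened Teacher probabilities, one sigmoid per appliance.
    soft_targets = torch.sigmoid(teacher_logits / T)
    soft = F.binary_cross_entropy_with_logits(student_logits / T, soft_targets) * (T * T)
    # Hard targets: the (possibly weak) multi-label appliance annotations.
    hard = F.binary_cross_entropy_with_logits(student_logits, labels.float())
    return alpha * soft + (1.0 - alpha) * hard
```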

    Optical constants modelling in silicon nitride membrane transiently excited by EUV radiation

    We report on a set of transient optical reflectivity and transmissivity measurements performed on silicon nitride thin membranes excited by extreme ultraviolet (EUV) radiation from a free electron laser (FEL). Experimental data were acquired as a function of the membrane thickness, FEL fluence, and probe polarization. The time dependence of the refractive index, retrieved using the Jones matrix formalism, encodes the dynamics of electron and lattice excitation following the FEL interaction. The observed dynamics are interpreted in the framework of a two-temperature model, which allows the relevant time scales and magnitudes of the processes to be extracted. We also found that, in order to explain the experimental data, thermo-optical effects and inter-band filling must be phenomenologically added to the model.
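
    The two-temperature model mentioned above couples the electron and lattice temperatures through two rate equations; a toy numerical integration is sketched below with placeholder parameters, not the fitted values for silicon nitride.

```python
import numpy as np
from scipy.integrate import solve_ivp

Ce, Cl, G = 2e4, 1.6e6, 5e17     # heat capacities [J m^-3 K^-1] and e-ph coupling [W m^-3 K^-1] (placeholders)
tau, F0 = 50e-15, 1e21           # pulse duration [s] and peak absorbed power density [W m^-3] (placeholders)

def source(t):
    # Gaussian approximation of the absorbed FEL pulse.
    return F0 * np.exp(-((t - 3 * tau) / tau) ** 2)

def ttm(t, T):
    Te, Tl = T
    dTe = (-G * (Te - Tl) + source(t)) / Ce   # electrons: heated by the pulse, cooled by the lattice
    dTl = ( G * (Te - Tl)) / Cl               # lattice: heated by electron-phonon coupling
    return [dTe, dTl]

sol = solve_ivp(ttm, (0.0, 5e-12), [300.0, 300.0], max_step=1e-14)
print(f"Peak electron T: {sol.y[0].max():.0f} K, final lattice T: {sol.y[1][-1]:.0f} K")
```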

    Task-Aware Separation for the DCASE 2020 Task 4 Sound Event Detection and Separation Challenge

    Source Separation is often used as a pre-processing step in many signal-processing tasks. In this work we propose a novel approach for combined Source Separation and Sound Event Detection in which a Source Separation algorithm is used to enhance the performance of the Sound Event Detection back-end. In particular, we present a permutation-invariant training scheme for optimizing the Source Separation system directly with the back-end Sound Event Detection objective, without requiring joint training or fine-tuning of the two systems. We show that such an approach has significant advantages over the more standard approach of training the Source Separation system separately using only a Source Separation objective such as Scale-Invariant Signal-To-Distortion Ratio. On the 2020 Detection and Classification of Acoustic Scenes and Events Task 4 Challenge, our proposed approach outperforms the baseline source separation system by more than one percent in event-based macro F1 score on the development set, with significantly lower computational requirements.
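
    A minimal sketch of the permutation-invariant scheme described above, where the loss driving the separator is the downstream SED objective rather than SI-SDR; `sed_model`, `sed_loss`, and the tensor layout are assumptions made for illustration.

```python
import itertools
import torch

def pit_sed_loss(separated, event_targets, sed_model, sed_loss):
    """separated: (batch, n_src, time) separator outputs;
    event_targets: (batch, n_src, ...) per-source event labels.
    The SED loss is computed for every source permutation and the best one is kept."""
    n_src = separated.shape[1]
    losses = []
    for perm in itertools.permutations(range(n_src)):
        total = sum(sed_loss(sed_model(separated[:, j]), event_targets[:, i])
                    for i, j in enumerate(perm))
        losses.append(total)
    # Gradients flow back into the separator only through the minimising permutation.
    return torch.stack(losses).min()
```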

    Domain-Adversarial Training and Trainable Parallel Front-end for the DCASE 2020 Task 4 Sound Event Detection Challenge

    In this paper, we propose several methods for improving the performance of Sound Event Detection systems in the context of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 Task 4 challenge. Our main contributions concern the training techniques, feature pre-processing, and prediction post-processing. Given the mismatch between the synthetic labelled data and the target domain data, we exploit domain-adversarial training to improve network generalization. We show that this technique is especially effective when coupled with dynamic mixing and data augmentation. By coupling the challenge baseline with the aforementioned techniques, together with Hidden Markov Model prediction smoothing, we are able to improve the event-based macro F1 score by more than 10% on the development set, without computational overhead at inference time. Moreover, we propose a novel, effective Parallel Per-Channel Energy Normalization front-end layer and show that it brings an additional improvement of more than one percent with minimal computational overhead.
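
    Domain-adversarial training typically relies on a gradient-reversal layer placed between the feature extractor and a domain classifier; a minimal PyTorch sketch is given below (the lambda schedule and the classifier heads used in this work are not shown).

```python
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)          # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_out):
        # Negated (scaled) gradient on the backward pass; no gradient for lam.
        return -ctx.lam * grad_out, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# Usage: features -> grad_reverse -> domain classifier. Minimising the domain-classification
# loss then pushes the feature extractor to make synthetic and real domains indistinguishable.
```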